Abstract


In  this  report,  a  Multi  Layer  Perceptron  (MLP)  Neural  Network  is  used  for  recognizing 
military  ground  vehicles  imaged  by  Synthetic  Aperture  Radar  (SAR).  In  particular,  the 
classifier  is  applied  to  SAR  images  taken  from  the  MSTAR  (Moving  and  Stationary  Target 
Acquisition  and  Recognition)  data  set,  which  has  been  made  available  to  the  public. 
Signatures  are  extracted  from  the  imagery  using  a  Fourier  Transform  method  and  features  are 
selected to feed the neural network. A 4-layer (including input and output layers) Neural
Network with 38 input nodes, 13 first hidden nodes, 11 second hidden nodes and 3 output
nodes is implemented for this task. The standard delta rule back-propagation algorithm is
used to train the neural network. The MLP neural network is evaluated according to the
MSTAR  standard  evaluation  criteria.  Training  of  3  vehicle  classes  occurs  using  a  set  of  SAR 
images  at  a  17-degree  depression  angle  with  0-360  degree  azimuthal  angles,  while  the  testing 
set  contains  images  at  a  15-degree  depression  angle  with  0-360  degree  azimuthal  angles.  The 
testing  set  contains  both  target  vehicles  that  belong  to  the  3  trained  classes  and  confuser 
vehicles  that  do  not.  Results  of  MLP  neural  network  evaluation  are  shown  using  Receiver 
Operating  Characteristic  (ROC)  curves  and  Confusion  Matrices. 


Executive summary


A Synthetic Aperture Radar (SAR) based automated target recognition (ATR) system requires a
fast and effective classifier to discriminate desired types of targets from man-made targets,
natural clutter and background noise. Many classifiers exist in the field of ATR,
each with advantages and disadvantages. The Multi Layer Perceptron (MLP) Neural
Network has been used successfully in other applications such as mining, signal processing
and pattern recognition. The author applied an MLP Neural Network to SAR images of
military vehicles and reports its performance on the MSTAR public data set.

The MLP Neural Network is evaluated using Receiver Operating Characteristic (ROC) curves
and Confusion Matrices on the publicly released MSTAR data set. The results of these
evaluations are given in the Results and Discussion section; the percentage of correct
classification of declared targets is about 85%. This rate
can be improved by choosing alternative methods for feature extraction and revisiting the
architecture of the MLP Neural Network.


Sandirasegaram  N.  (2002).  Automatic  Target  Recognition  in  SAR  Imagery  using  a  MLP 
Neural  Network.  DRDC  Ottawa  TM  2002-120.  Defence  R&D  Canada  -  Ottawa. 
1.  Introduction


Automatic Target Recognition (ATR) for SAR imagery is an area of ongoing research in
many branches of the military and in large research institutions [1]. General functions of ATR,
such as target detection and classification, are described in more detail in [2] and [3]. The
U.S. Defense Advanced Research Projects Agency (DARPA) has made part of the Moving
and Stationary Target Acquisition and Recognition (MSTAR) data set available to the public.
The MSTAR public data set contains many spotlight SAR vehicle images, including 10 types
of former Soviet Union vehicles, at azimuthal angles from 0 to 360 degrees and depression
angles of 15 and 17 degrees.

The COMPASE Center of AFRL developed a standard MSTAR evaluation methodology to
evaluate ATR algorithms using the MSTAR public data set [4]. The standard evaluation
method uses Confusion Matrices and Receiver Operating Characteristic (ROC) curves, as
was done for other ATR algorithms [5,6,7] (such as HNeT and
template matching). Here this method is also used to evaluate a Neural Network
algorithm.

Neural Networks have been successfully applied to classification problems in the areas of
industry, business and science [8]. Neural Networks and statistical classifiers have been
compared in different applications [9, 10, 11]. Statistical methods need more space to store
all the training data, and they often work very slowly compared to a Neural Network (NN)
classifier [10]. Fauzi et al. compared Maximum Likelihood and NN classifiers for
characterizing the condition of logged-over and unlogged tropical rain forest using
remotely sensed satellite data, and showed that the overall accuracy of the NN is better than
that of Maximum Likelihood [11]. Benediktsson et al. applied both classifiers to the same
multisource remote sensing data and determined that a three-layer NN is more
appropriate for multisource classification if the training time is reasonable
[9]. However, a NN has a weakness: when the number of training samples becomes large, the
training time can be very long [9]. Additional details about NN technologies for ATR can be
found in [12].

Neural Networks attempt to copy the abilities of biological neurons [13]; the reader can find
an explanation of the differences between biological and artificial neurons in [13, 14, 15].
The first NN model was introduced by McCulloch and Pitts [16, 17] in the 1940s, and NNs are
still an area of active research today. A NN is characterized by its network architecture, node
properties, learning rules and connections between neurons [15]. The reader can find differently
structured or characterized NNs in [13, 15]. The most commonly used nonlinear regression and
discriminant model is the Multi Layer Perceptron NN [18], which is capable of learning
nonlinear function mappings [15]. The Multi Layer Perceptron (MLP) NN has been used
extensively in various problems [19], more than any other NN [13]. The typical MLP NN is
built with an input layer, an output layer and at least one hidden layer [13]. Here, we
evaluate an MLP NN ATR algorithm applied to the MSTAR public data set. In this case, a
4-layer MLP neural network is implemented with 38 input nodes, 13 first hidden nodes, 11
second hidden nodes and 3 output nodes. The standard delta rule back-propagation training
method is used to train the implemented neural network, allowing it to learn about specific
vehicle types from the training set; when the testing set is then introduced, the neural
network predicts a classification based on the knowledge learned from the training
set.

Preprocessing of the imagery begins by taking a 64 x 64 pixel block from the center of each
target chip for Fourier feature extraction, which is explained in Section 2. Section 3 provides
an overview of the MLP NN training and testing algorithms. The MSTAR data set used for
training and testing is described in Section 4, and the evaluation of results and discussion
are given in Section 5. Finally, the conclusions are given in Section 6.


2.  Feature  Extraction 


Feature  extraction  is  a  necessary  step  in  the  classification  process.  It  is  a  preprocessing 
technique  to  standardize  information  provided  to  the  classifier.  In  addition,  it  increases  the 
training  speed  and  testing  speed  of  the  classifier,  as  well  as  reducing  the  size  of  the  training 
samples. This feature extraction method is inspired by the one used by HNeT [5]. In the
MSTAR problem, three classes of former Soviet Union military vehicles are
considered: BMP2, BTR70 and T72. For this problem, the HNeT classifier is
implemented using three binary classifiers, one for each class, and uses 256 features for each
class. Here, only one NN classifier is implemented for all three classes (a multi-class
classifier), and 16 features are used for each class instead of the 256 used in HNeT, that
is, a total of 48 features for the three classes. The normalization method and the method of
selecting the best 16 invariant features follow the same procedure as the selection of the 256
features in the HNeT method.

A real-to-complex Fourier Transform method is applied to the pixel magnitude of each SAR
image, generating a set of Fourier coefficients to be used as our feature space. Fourier
Transform coefficients are measures of periodicity. To reduce the computation time of the
Fourier transform, the SAR images are first cropped to N x N (N = 64) chips (the fast
implementation of the Discrete Fourier Transform (DFT) is faster if the image size is a power
of 2). Before applying the DFT to get the Fourier coefficients, the chip is normalized as follows:


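$$\mu = \frac{\sum_{x=1}^{N}\sum_{y=1}^{N} f(x,y)}{N \cdot N} \tag{1}$$

where $\mu$ is the mean and $f(x,y)$ is the image pixel magnitude at location $(x, y)$.

$$\sigma = \sqrt{\frac{\sum_{x=1}^{N}\sum_{y=1}^{N} \left(f(x,y)-\mu\right)^{2}}{N \cdot N}} \tag{2}$$

where $\sigma$ is the standard deviation of the image.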
The mean and standard deviation are calculated as shown in Eq. 1 and Eq. 2. Then the image is
normalized as follows,

$$nf(x,y) = \frac{f(x,y)-\mu}{\sigma}, \qquad 1 \le x, y \le N, \tag{3}$$


where  nf(x,y)  is  the  normalized  image  value.  Effectively,  the  mean  is  set  to  zero  and  the 
standard  deviation  set  to  one  by  doing  the  above  normalization.  The  DFT  algorithm  is  applied 
to  the  normalized  image  according  to 


$$F(u,v) = \frac{1}{N}\sum_{x=1}^{N}\sum_{y=1}^{N} nf(x,y)\, e^{-2\pi i (xu+yv)/N}, \tag{4}$$


where F(u,v) is the Fourier transform of the image. The Fourier coefficients from (4) are
separated into 4096 real (RF(u,v)) coefficients and 4096 imaginary (IF(u,v)) coefficients.
Then, normalized RF(u,v) and IF(u,v) are calculated separately by replacing f(x,y) in (1) and
(2) with RF(u,v) and IF(u,v), whereby (3) becomes

$$NRF(u,v) = \frac{RF(u,v)-\mu_{real}}{\sigma_{real}}, \qquad 1 \le u, v \le N, \tag{5}$$

and

$$NIF(u,v) = \frac{IF(u,v)-\mu_{imag}}{\sigma_{imag}}, \qquad 1 \le u, v \le N. \tag{6}$$
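Equations 1 to 6 can be illustrated with a short NumPy sketch. This is not the author's implementation: the function names `normalize` and `fourier_features` are hypothetical, the chip is random stand-in data, and `numpy.fft.fft2` indexes pixels from 0 rather than 1, which changes only the phase convention of the coefficients relative to Eq. 4.

```python
import numpy as np

def normalize(a):
    """Scale an array to zero mean and unit standard deviation (Eqs. 1-3)."""
    # np.std divides by the number of elements (N * N), as in Eq. 2.
    return (a - a.mean()) / a.std()

def fourier_features(chip):
    """Return normalized real and imaginary DFT coefficients of an N x N chip."""
    n = chip.shape[0]
    nf = normalize(chip.astype(float))   # Eq. 3
    F = np.fft.fft2(nf) / n              # Eq. 4 with its 1/N scaling
    nrf = normalize(F.real)              # Eq. 5
    nif = normalize(F.imag)              # Eq. 6
    return nrf, nif

# Stand-in for a 64 x 64 pixel-magnitude chip (assumption: random data).
rng = np.random.default_rng(0)
chip = rng.random((64, 64))
nrf, nif = fourier_features(chip)
print(nrf.size + nif.size)               # number of coefficients per chip
```

For a 64 x 64 chip the two arrays together hold the 4096 real and 4096 imaginary coefficients discussed in the text.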


NRF(u,v) and NIF(u,v) are the normalized real and imaginary Fourier coefficients. In this
way, normalized real and imaginary coefficients are computed for all the training samples.
Since there are 8192 Fourier coefficients (4096 real and 4096 imaginary), feeding
all of them to a NN would require too many inputs, would impede
the generalization capability of the NN and would needlessly consume time and computing
resources. Therefore, only a selected few of the coefficients (16 for each class) are retained. The
16 features are not randomly selected, but are the most invariant coefficients in that particular
class of image compared to the coefficients of other classes. To get these 16 features for each
class, all the training samples' normalized coefficients are first linearly mapped to a polar angle
[5] between 0 and $\pi$. The linear mapping of each coefficient is given by

$$\theta_k(u,v) = \mathit{Rconst}(u,v)\,\bigl(NF_k(u,v) - mi(u,v)\bigr), \tag{7}$$

where

$$NF_k(u,v) = \begin{cases} NRF_k(u,v) \\ \text{or} \\ NIF_k(u,v) \end{cases}, \tag{8}$$

$$\theta_k(u,v) = \begin{cases} R\theta_k(u,v) \\ \text{or} \\ I\theta_k(u,v) \end{cases}, \tag{9}$$

and

$$\mathit{Rconst}(u,v) = \frac{\pi}{ma(u,v) - mi(u,v)}, \tag{10}$$

with

$$mi(u,v) = \min_k\bigl(NF_k(u,v)\bigr), \tag{11}$$

and

$$ma(u,v) = \max_k\bigl(NF_k(u,v)\bigr), \tag{12}$$

subject to $1 \le u, v \le N$ and $1 \le k \le M$.


$R\theta_k(u,v)$ and $I\theta_k(u,v)$ are the real and imaginary rescaled polar angles and Rconst(u,v) is the
ratio of the modified range to the original range of the real/imaginary coefficients at (u,v).
mi(u,v) and ma(u,v) are the minimum and the maximum values of the real/imaginary
coefficients at (u,v), respectively. M is the number of training samples in the training set.
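The linear mapping to polar angles can be sketched as follows. The function name `to_polar_angles` and the random input are hypothetical, and the sketch assumes that every coefficient position takes at least two distinct values across the training samples, so that ma(u,v) - mi(u,v) is never zero.

```python
import numpy as np

def to_polar_angles(nf):
    """Map normalized coefficients to polar angles in [0, pi], per position.

    nf has shape (M, N, N): M training samples of N x N coefficients,
    either all real parts (NRF) or all imaginary parts (NIF).
    """
    mi = nf.min(axis=0)              # minimum over the M samples
    ma = nf.max(axis=0)              # maximum over the M samples
    rconst = np.pi / (ma - mi)       # modified range / original range
    return rconst * (nf - mi)        # rescaled polar angle

# Stand-in data (assumption): 10 samples of 4 x 4 normalized coefficients.
rng = np.random.default_rng(1)
nrf = rng.normal(size=(10, 4, 4))
theta = to_polar_angles(nrf)
```

At every position (u,v) the sample holding the minimum maps to 0 and the sample holding the maximum maps to pi.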

For a given target class, each training sample belongs to one of two groups, the in-class
group or the out-class group. Ideally, the 16 features to be selected need to be invariant for
in-class samples and randomly valued over the out-class group. To measure the invariance of the
in-class features, the number of in-class samples should be equal to the number of out-class
samples. If the numbers of samples in the two groups are not equal, then a random selection of
samples from the smaller group is added again to that group, thereby making the
number of samples in both groups equal. Call this number L. A measure of invariance for each
Fourier coefficient can then be calculated according to


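$$RR(u,v) = \left|\sum_{k=1}^{L} s(k)\, e^{i R\theta_k(u,v)}\right| \tag{13}$$

and

$$IR(u,v) = \left|\sum_{k=1}^{L} s(k)\, e^{i I\theta_k(u,v)}\right|, \tag{14}$$

where

$$s(k) = \begin{cases} +1 & \text{if sample } k \text{ is in-class} \\ -1 & \text{if sample } k \text{ is out-class} \end{cases} \tag{15}$$

and $1 \le u, v \le N$.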


The 16 longest independent polar vector lengths, RR(u,v) or IR(u,v), are selected for the
feature set. Redundancy in the real-to-complex Fourier transform means that each coefficient
appears as a duplicate pair. Only one of each pair is retained in the feature set for each class.
Some of these features may be common to more than one class, while others are not. For the
3 MSTAR vehicle classes, the total number of unique selected features has been determined
to be 38, less than the maximum possible total of 48 features for the 3 classes. These 38
features are fed into the neural network after being shifted to lie between $-\pi/2$ and $+\pi/2$ by
subtracting $\pi/2$ from the calculated polar angle (Eq. 7).
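The invariance measure and the selection of the longest vectors can be sketched as below. This is an illustration under stated assumptions, not the report's code: the names `invariance` and `select_features` are hypothetical, the sum runs over all supplied samples, the two groups are assumed to be balanced already, and the removal of duplicate coefficient pairs is omitted. The toy data make one coefficient constant for in-class samples so that it should rank first.

```python
import numpy as np

def invariance(theta, in_class):
    """Length of the signed polar vector sum at each position (u, v).

    theta: (K, N, N) polar angles; in_class: (K,) boolean group labels.
    """
    s = np.where(in_class, 1.0, -1.0)               # +1 in-class, -1 out-class
    return np.abs(np.tensordot(s, np.exp(1j * theta), axes=1))

def select_features(theta, in_class, n_features=16):
    """Flat indices of the n_features longest vectors, longest first."""
    r = invariance(theta, in_class).ravel()
    return np.argsort(r)[::-1][:n_features]

# Toy data (assumption): 200 in-class and 200 out-class samples of 4 x 4 angles.
rng = np.random.default_rng(2)
K, N = 400, 4
theta = rng.uniform(0.0, np.pi, size=(K, N, N))
in_class = np.arange(K) < K // 2
theta[in_class, 0, 0] = 1.0     # position (0, 0) is invariant for the in-class group
best = select_features(theta, in_class, n_features=1)
```

With balanced groups, a coefficient that is random in both groups contributes vectors that largely cancel, while an in-class-invariant coefficient accumulates a long vector, which is why the group sizes are equalized first.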




3.  Implementation  of  MLP  Neural  Network 


Using  the  vector  of  extracted  features,  the  classifier  must  be  able  to  correctly  decide  whether 
each  image  is  a  known  target  or  an  unknown  target.  The  MLP  NN  is  able  to  make  an 
intelligent  decision  based  on  learning  the  sample  data.  The  MLP  NN  contains  many  layers 
and  nodes  that  are  connected  via  adjustable  weights.  By  manipulating  these  weights,  the  NN 
output decisions can be matched to the desired decisions known for the training set. The input and
output layers are in contact with the outside world, while the hidden layers are not available
for outside connection; each hidden layer instead sits between the input and output layers, between
the input and another hidden layer, between two hidden layers, or between a hidden layer and the output layer.

For application to the MSTAR problem, an MLP NN was selected with two hidden layers: 13
nodes in the first hidden layer and 11 nodes in the second hidden layer, as shown in Figure 1. The
number of layers and the number of nodes in the hidden layers were decided by empirical
testing of the NN; these numbers are not optimized. The number of input nodes depends
on the number of features chosen and, as we have chosen 38 features (see Section 2), 38
nodes are included in the input layer of the NN. Three ground vehicle classes (BMP-2,
BTR-70 and T-72) are considered in this task, so three output nodes are included in
the output layer.




Figure  1.  Four  layer  Multi  Layer  Perceptron  Neural  Network 


The MLP neural network was trained using a standard back-propagation training method, the
delta rule algorithm [20], which is used to update the weights in this task. Details of the
derivation of the learning (training) and testing algorithms are not discussed here, but their
implementation is described. The steps of the training algorithm are listed in Table 1.
Weights are initialized with small random values and then each training sample is fed into the
NN, one by one. For every sample, the output error relative to the desired result is computed and
the partial derivative of the error is calculated with respect to each weight. This step is
carried out for all the training samples and the partial derivatives are summed
for each weight. These partial derivatives indicate the direction in which each
weight has to be varied to minimize the total error. Thus, the weights are updated using the
previous weights, the partial derivatives and the preset learning rate ($\alpha$). The learning rate
determines how far to move the weights in the direction given by the partial derivatives.
Through empirical testing, $\alpha = 0.35$ is chosen for this application. If the learning rate is too
large, then the NN will oscillate and will not minimize the error. If the learning
rate is too small, then learning will be slow, although the NN will eventually
converge to a local minimum.

Table 1. MLP training steps using back-propagation training method

Step 1. Initialize weights to small random numbers

Step 2. Bias (b) and sigmoid function's slope (s) set to 0.1 and 0.2, respectively (These values
are decided by this author based on his own experience).

Step 3. Input a sample from the training set

Step 4. Compute output

$$Net\_inp_j = \sum_{i=1}^{N} (x_i w_{ij}) + b,$$

$$y_j = \frac{2}{1 + e^{-(s \times Net\_inp_j)}} - 1, \qquad 1 \le j \le M,$$

where i is the node of the previous layer, N is the number of nodes in the previous layer,
j is the node of the calculating layer and M is the number of nodes in the calculating layer.

If the calculating layer is the first hidden layer then $x_i$ is the ith node input feature, otherwise
$x_i = y_i$ ($y_i$ is the output of the ith node of the previous layer), and $y_j$ is the output of the jth node
of the calculating layer.

Step 5. Estimate of weight adjustments

Weights adjustment between last hidden layer and output layer

$$\nabla w_{jl} = \nabla w_{jl} + \alpha \times \delta_l \times y_j,$$

where $t_l$ is the desired output value, $O_l$ is the calculated output value, $\delta_l$ is the error at
output node l, $y_j$ is the calculated output value (at node j) for hidden layers and $\nabla w_{jl}$ is
the weights adjustment from the hidden layer node j to output layer node l.

Weights adjustment between input and hidden layers or between hidden layers

$$\delta_j = (1 - y_j^2) \sum_{l} \delta_l w_{jl},$$

$$\nabla w_{ij} = \nabla w_{ij} + \alpha \times \delta_j \times y_i,$$

where $\delta_l$ is the error at output node l, $\delta_j$ is the error at hidden node j, $y_i$ is the calculated ith
node output at the hidden layer (if j is a node of the first hidden layer, then $y_i = x_i$ and $x_i$ is
the input feature) and $\nabla w_{ij}$ is the weights adjustment from node i to node j.

Step 6. If not all the samples have been used for training, then select the next sample from the
training set and repeat Steps 4 and 5. Otherwise go to Step 7.

Step 7. Update of weights

The weights between the last hidden layer and output layer are determined by

$$w_{jl}(t+1) = w_{jl}(t) + \nabla w_{jl},$$

whereas the weights between the input and hidden layers or between adjacent hidden
layers use

$$w_{ij}(t+1) = w_{ij}(t) + \nabla w_{ij}.$$

Step 8. Calculate the number of targets correctly classified

Using the updated weights, calculate the output layer node values as in Step 4. Then
compare the result with the target output.

Count = 0
Start p = 1 and go until p = M, where M = number of training samples
    check = 0
    Start j = 1 and go until j = N, where N = number of output nodes
        If |t_pj - O_pj| > threshold, where threshold is set to 0.3
            check = 1
            get out from the j loop
        End if
    End loop j
    If check = 0
        Count = Count + 1
    End if
End loop p

If Count = M, then the training phase is completed and the training process stops; otherwise go
back to Step 3.
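The training steps of Table 1 can be sketched in NumPy as below. This is a minimal illustration under assumptions, not the author's code: the function names, the toy 4-5-3 network and the random data are hypothetical; the bias is treated as the fixed constant b; and the error terms use the exact derivative of the bipolar sigmoid, (s/2)(1 - y^2), which may differ by a constant factor from the expression used in the report.

```python
import numpy as np

S, B, ALPHA = 0.2, 0.1, 0.35    # slope, bias and learning rate quoted in the text

def act(net):
    """Bipolar sigmoid with outputs in (-1, +1)."""
    return 2.0 / (1.0 + np.exp(-S * net)) - 1.0

def forward(x, weights):
    """Return the activations of every layer, the input included."""
    ys = [x]
    for w in weights:
        ys.append(act(ys[-1] @ w + B))
    return ys

def train_epoch(X, T, weights):
    """Sum the adjustments over all samples, then update the weights once."""
    ys = forward(X, weights)
    delta = (T - ys[-1]) * 0.5 * S * (1.0 - ys[-1] ** 2)    # output-layer error
    grads = []
    for k in range(len(weights) - 1, -1, -1):
        grads.append(ys[k].T @ delta)                        # summed over samples
        if k > 0:                                            # propagate the error back
            delta = (delta @ weights[k].T) * 0.5 * S * (1.0 - ys[k] ** 2)
    for w, g in zip(weights, reversed(grads)):
        w += ALPHA * g                                       # weight update
    return weights

def all_learned(X, T, weights, threshold=0.3):
    """True when every output of every sample is within threshold of its target."""
    return bool(np.all(np.abs(T - forward(X, weights)[-1]) <= threshold))

# Toy problem (assumption): 6 samples, 4 features, 3 classes, one hidden layer of 5 nodes.
rng = np.random.default_rng(3)
X = rng.uniform(-np.pi / 2, np.pi / 2, size=(6, 4))
T = -np.ones((6, 3))
T[np.arange(6), np.arange(6) % 3] = 1.0
weights = [rng.uniform(-0.1, 0.1, size=shape) for shape in [(4, 5), (5, 3)]]
err_before = np.sum((T - forward(X, weights)[-1]) ** 2)
for _ in range(20):
    train_epoch(X, T, weights)
err_after = np.sum((T - forward(X, weights)[-1]) ** 2)
```

In a full run, `train_epoch` would be repeated until `all_learned` returns True, mirroring the stopping test of Step 8; the 20 epochs here only show the squared error starting to fall.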


The MLP NN is trained with the stopping criterion that the network should recognize the entire
training set correctly. However, there is no guarantee in any case that the algorithm will reach
the global minimum error. The NN output nodes are assigned to the targets as follows: T-72
to node 1, BTR-70 to node 2 and BMP-2 to node 3. For the training phase, output values are
limited to the range -1 to +1 by the sigmoid function. For example, for a BTR-70 image to be correctly
recognized, output nodes 1 and 3 must generate a value below (-1 + threshold) and output node
2 a value above (+1 - threshold). In the evaluation experiments, the threshold value is
empirically set to 0.3. The time taken for training varies widely, normally ranging
from 5 to 30 minutes. In this application, if all the samples were not learned within half an
hour, the training process should be restarted. The trained weights used for this experiment took
450.937 seconds (Pentium 4 CPU 2.0 GHz computer) to converge on all the
training samples. The storage needed for the training process is 11,210 Kb
(11,199 Kb for the training samples binary file, 3 Kb for the initialization text file and 8 Kb for the
weights text file).


Table 2. MLP testing steps

Step 1. Initialize weights to previously trained weights

Step 2. Input a sample from the testing set

Step 3. Compute output

For hidden layer nodes

$$Net\_inp_j = \sum_{i=1}^{N} (x_i w_{ij}) + b,$$

$$y_j = \frac{2}{1 + e^{-(s \times Net\_inp_j)}} - 1, \qquad 1 \le j \le M,$$

where i is the node of the previous layer, N is the number of nodes in the previous layer,
j is the node of the calculating layer and M is the number of nodes in the calculating
layer. If the calculating layer is the first hidden layer then $x_i$ is the ith node input feature,
otherwise $x_i = y_i$ ($y_i$ is the output of the ith node of the previous layer), and $y_j$ is the output of
the jth node of the calculating layer.

For output layer nodes

$$Net\_inp_j = \sum_{i=1}^{N} (y_i w_{ij}) + bias,$$

$$O_j = \frac{2}{1 + e^{-(slope \times Net\_inp_j)}}, \qquad 1 \le j \le M,$$

where N is the number of nodes in the last hidden layer, M is the number of nodes in the
output layer, $y_i$ is the output of the last hidden layer node i and $O_j$ is the output of the jth
node of the output layer.

Step 4. If not all the samples have been tested, then select the next sample from the testing set and
repeat Step 3. Otherwise end the testing process.


The testing algorithm is very simple and its steps are listed in Table 2. Each test sample is fed
into the input of the trained NN and then the output is computed using the predetermined
trained weights. The output values vary from 0 to 2. An output value equal to 2 means
the test sample is similar to that vehicle type and 0 means it is not similar. In practice the
output values are never exactly 2 or zero, so it is necessary to have some kind of
threshold range within which to accept the decision. The threshold may be varied, which
parameterizes the corresponding Receiver Operating Characteristic (ROC) curve.
Conversely, the threshold may be chosen from the ROC curve according to what
detection rate is needed for the application, as described in the Results and Discussion
section (Section 5). If more than one output node has a value close to 2 within the threshold
range, then the node closest to 2 is selected. If none of the output node values is close to 2
within the threshold range, then the test sample is classified as unknown.
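The decision rule can be sketched as follows. The function name `decide` is hypothetical, and the sketch assumes that "within the threshold range" means that the output lies no further than the threshold below 2; the node-to-class assignment follows the one given for training.

```python
import numpy as np

CLASSES = ("T-72", "BTR-70", "BMP-2")    # output nodes 1, 2 and 3

def decide(outputs, threshold):
    """Return the class whose output is closest to 2, or 'unknown'."""
    outputs = np.asarray(outputs, dtype=float)
    distance = 2.0 - outputs             # outputs lie between 0 and 2
    best = int(np.argmin(distance))      # node closest to 2
    if distance[best] <= threshold:
        return CLASSES[best]
    return "unknown"

label_single = decide([0.1, 1.9, 0.2], threshold=0.3)    # one node near 2
label_tie = decide([1.8, 1.95, 0.2], threshold=0.3)      # two nodes near 2
label_none = decide([0.5, 0.6, 1.0], threshold=0.3)      # no node near 2
```

When two nodes fall inside the threshold range, the one nearer to 2 wins; when none does, the sample is rejected as unknown, which is how confuser vehicles are meant to be handled.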


4.  Data  Set 


With the increasing need for automated exploitation of SAR images, the collection of
ground-truthed data has also increased [4]. The data used for this study is the MSTAR public data set of
SAR images collected in spotlight mode at 30 cm resolution [21]. The data was collected in
September 1995, November 1996 and May 1997 by the Sandia National Laboratory (SNL)
and released to the public by the U.S. Air Force Wright Laboratory [21]. The data is divided into
target and confuser data sets according to the COMPASE Center evaluation criteria [4]. The
target set is then divided into a training set and a testing set. During the training stage, the
classifier algorithm needs sample images along with their known class types. The training set for
this application has three target classes: T-72, BMP-2 and BTR-70. All training images are at a
17-degree depression angle with full aspect coverage. The target types and the number
of samples used for training are listed in Table 3. A single vehicle of each vehicle type is
used to train the classifier.


Table 3. Training Set

Target type (serial number)    # of samples
T-72 (132)                     232
BTR-70 (c72)                   233
BMP-2 (9563)                   233
Total                          698

Comments: All the targets were collected at a 17-degree depression angle, with full aspect
coverage and 30 cm resolution.

The data set (Testing set 1) used to measure the recognition rate is listed in Table 4. All
the vehicles used for this test belong to one of the defined class types but may have a different
serial number. In addition, all test imagery is at a 15-degree depression angle instead of the
17-degree depression angle used for training.


Table 4. Testing set 1 - Testing samples for confusion matrix test

Target type (serial number)    # of samples
T-72 (812)                     195
T-72 (s7)                      191
T-72 (132)                     196
BTR-70 (c72)                   196
BMP-2 (9563)                   195
BMP-2 (9566)                   196
BMP-2 (c21)                    196
Total                          1365

Comments: All the targets were collected at a 15-degree depression angle, with full aspect
coverage and 30 cm resolution.
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To  generate  ROC  curves,  vehicles  not  belonging  to  the  types  used  in  training  are  required  to 
measure  false  alarms.  The  confuser  images  are  also  at  the  15-degree  depression  angle.  Table 
5  shows  the  types  of  confusers  and  number  of  samples  used  to  perform  the  experiment. 


Table 5. Testing set 2 - Testing samples for ROC curve test

Target type and serial number    # of samples
2S1                              274
D7                               274
T62                              273
ZIL-131                          274
BTR-60                           195
ZSU-23/4                         273
Total                            1563

Comments: All the confusers collected at 15-degree depression angle, full aspect coverage and one-foot resolution.

Both the training and testing samples were extracted from image chips that were
first aligned according to the ground-truth heading; the central block of 64 x 64
pixels was then extracted from each aligned chip. The two-hidden-layer MLP NN was trained using the
training data set according to the procedure outlined in Section 3. The results of the evaluation
experiments are discussed in the next section.


5.  Results  and  Discussion 


The trained NN is applied to testing sets 1 and 2 as described in Section 4. To generate the
ROC curves, the threshold value is varied from 0 to 2 in increments of 0.01. At each
increment, the percentage of detection (Pd) and the percentage of false alarms (Pfa) are
calculated, and Pd is then plotted against Pfa. The resulting graph is a “Receiver Operating
Characteristic (ROC) curve” [22], as shown in Figure 2.
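
The threshold sweep described above can be sketched as follows. This is a minimal illustration rather than the report's implementation: it assumes that each image has already been reduced to a scalar error score for one output node, and that an image is declared a target when its error does not exceed the threshold. The function name `roc_points` and its arguments are hypothetical.

```python
import numpy as np

def roc_points(target_errors, confuser_errors, t_max=2.0, step=0.01):
    """Sweep an error threshold from 0 to t_max; return (thresholds, Pd, Pfa).

    Assumption: an image is declared a target when its error score is less
    than or equal to the threshold. Pd and Pfa are fractions in [0, 1].
    """
    target_errors = np.asarray(target_errors, dtype=float)
    confuser_errors = np.asarray(confuser_errors, dtype=float)
    n_steps = int(round(t_max / step))
    thresholds = np.linspace(0.0, t_max, n_steps + 1)
    # Fraction of true targets declared at each threshold (detections).
    pd = np.array([np.mean(target_errors <= t) for t in thresholds])
    # Fraction of confusers declared at each threshold (false alarms).
    pfa = np.array([np.mean(confuser_errors <= t) for t in thresholds])
    return thresholds, pd, pfa
```

Plotting `pd` against `pfa` for each output node would give one curve per trained class.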

According to Figure 2, for the same Pd, the Pfa rate is highest for the BMP-2, while the BTR-70
performs better than the other two vehicles. Better classifiers provide lower
Pfa rates and higher Pd rates. From Figure 2, an optimal threshold value can be found for
each classifier node by taking the operating point on its
ROC curve that is closest to the point (0,1), i.e., the upper left corner. For this case, the threshold
values are 0.03 for the T-72, 0.01 for the BTR-70 and 0.12 for the BMP-2. Using these thresholds,
correct classification rates are calculated for testing set 1, as shown in Table 6. Asterisks
indicate the specific vehicles that appear in both the training and testing sets. The results obtained
for these vehicles are higher than those obtained for other vehicles of the same type,
as expected.
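
Choosing the operating point closest to (0,1) can be sketched as below. This is an illustration under assumptions: Pd and Pfa are taken as fractions sampled at a list of thresholds, distance is plain Euclidean distance, and the name `closest_to_ideal` and the sample values are hypothetical (they are not read from Figure 2).

```python
import numpy as np

def closest_to_ideal(thresholds, pd, pfa):
    """Return (threshold, Pd, Pfa) for the ROC point nearest to (Pfa, Pd) = (0, 1)."""
    thresholds = np.asarray(thresholds, dtype=float)
    pd = np.asarray(pd, dtype=float)
    pfa = np.asarray(pfa, dtype=float)
    # Euclidean distance from each operating point to the upper left corner.
    distance = np.hypot(pfa - 0.0, pd - 1.0)
    i = int(np.argmin(distance))
    return float(thresholds[i]), float(pd[i]), float(pfa[i])

# Made-up ROC samples, for illustration only.
t, d, f = closest_to_ideal([0.0, 0.1, 0.2], [0.5, 0.9, 1.0], [0.0, 0.1, 0.6])
print(t, d, f)
```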
Figure 2. ROC curves of the MLP NN classifier, using the test set against the six confusers.


Table 6. MLP confusion matrix (at high Pd rate and low Pfa rate)

                  T-72   BTR-70   BMP-2   Rejected   Pcc|d (%)
T-72 (812)         127        8      16         44       84.11
T-72 (s7)          112       11      30         38       73.20
T-72 (132)*        171        0       6         19       96.61
BTR-70 (c72)*        3      178       0         15       98.34
BMP-2 (9563)*        3        2     168         22       97.11
BMP-2 (9566)        29       10     121         36       75.63
BMP-2 (c21)         18        2     150         26       88.24

Pcc|d = 88.15%
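
The Pcc|d values in Table 6 are consistent with dividing the number of correctly classified images by the number of declared images, that is, images assigned to any of the three classes; the fourth count in each row (images not declared as any class) is left out of the denominator. The sketch below recomputes the per-vehicle and overall figures from the counts in Table 6. The variable names are illustrative only.

```python
import numpy as np

# Rows follow Table 6; columns are declared T-72, BTR-70, BMP-2, then not declared.
counts = np.array([
    [127,   8,  16, 44],   # T-72 (812)
    [112,  11,  30, 38],   # T-72 (s7)
    [171,   0,   6, 19],   # T-72 (132)
    [  3, 178,   0, 15],   # BTR-70 (c72)
    [  3,   2, 168, 22],   # BMP-2 (9563)
    [ 29,  10, 121, 36],   # BMP-2 (9566)
    [ 18,   2, 150, 26],   # BMP-2 (c21)
])
true_class = np.array([0, 0, 0, 1, 2, 2, 2])  # column index of the correct class

declared = counts[:, :3].sum(axis=1)                # images assigned to any class
correct = counts[np.arange(len(counts)), true_class]
pcc_per_vehicle = 100.0 * correct / declared        # per-row Pcc|d in percent
pcc_overall = 100.0 * correct.sum() / declared.sum()

print(np.round(pcc_per_vehicle, 2))
print(f"{pcc_overall:.2f}")  # 88.15
```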

Keeping the same thresholds, the misclassification and rejection rates are determined using
testing set 2. The misclassification rate for each confuser vehicle is calculated by dividing the number
of its images misclassified as a given target class by the total number of its images tested. The results,
listed in Table 7, show that the misclassification rate is considerably higher than the rejection rate
for some of the vehicles. For instance, the D7 is confused for a BMP-2 and the T-62 is often
confused for a T-72.

Table 7. Misclassification rate (%) and confuser rejection rate (%) (at high Pd rate and low Pfa rate)

              T-72   BTR-70   BMP-2   Rejected
2S1          12.77    28.47   31.75      27.01
D7            5.47     0.36   85.77       8.39
T62          49.27     6.20   17.15      27.37
ZIL-131      22.63    22.26   28.83      26.28
BTR-60       18.97    23.59   22.05      35.38
ZSU-23/4     42.86     0.00   30.77      26.37
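
The rate calculation described above can be sketched as follows. The report gives only percentages, so the integer counts in the example are hypothetical: they were chosen so that, with the 273 ZSU-23/4 images of testing set 2, they round to the ZSU-23/4 row of Table 7. The function name `confuser_rates` is also an assumption.

```python
def confuser_rates(declared_as, rejected):
    """Return percentage rates for one confuser type.

    declared_as: dict mapping target class -> number of confuser images
                 declared as that class (misclassifications).
    rejected:    number of confuser images rejected by the classifier.
    """
    total = sum(declared_as.values()) + rejected
    rates = {cls: 100.0 * n / total for cls, n in declared_as.items()}
    rates["rejected"] = 100.0 * rejected / total
    return rates

# Hypothetical counts (not given in the report); they sum to 273 images.
rates = confuser_rates({"T-72": 117, "BTR-70": 0, "BMP-2": 84}, rejected=72)
for name, value in rates.items():
    print(f"{name}: {value:.2f}%")
```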

Typically, the evaluation experiment is done with Pd set to 0.9. The thresholds are determined
for each class from the ROC curves, giving 0.14 for the T-72, 0.14 for the BTR-70 and 0.14
for the BMP-2; the results are listed in Table 8. The overall Pcc|d and rejection rates
decrease and the misclassification rate increases compared to the previous experiment, as
shown in Table 9. Notably, none of the ZSU-23/4 images were misclassified as BTR-70 at the
previous threshold setting and only 0.37% are misclassified at this threshold setting, indicating
that the features extracted in this case discriminate well between those two classes.
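
Reading a threshold off an ROC curve for a target Pd can be sketched as below. This is an assumption-laden illustration: it takes Pd sampled at increasing thresholds, assumes Pd does not decrease as the threshold grows, and returns the smallest threshold that reaches the target. The name `threshold_for_pd` and the sample values are hypothetical.

```python
import numpy as np

def threshold_for_pd(thresholds, pd, target_pd=0.9):
    """Return the smallest threshold whose Pd reaches target_pd.

    Assumes thresholds are sorted in increasing order and Pd is
    non-decreasing in the threshold.
    """
    thresholds = np.asarray(thresholds, dtype=float)
    pd = np.asarray(pd, dtype=float)
    reached = np.flatnonzero(pd >= target_pd)
    if reached.size == 0:
        raise ValueError("target Pd is never reached on this curve")
    return float(thresholds[reached[0]])

# Made-up samples, for illustration only.
print(threshold_for_pd([0.0, 0.1, 0.2, 0.3], [0.2, 0.85, 0.93, 1.0]))
```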


Table 8. MLP confusion matrix (Pd set to 0.9)

                  T-72   BTR-70   BMP-2   Rejected   Pcc|d (%)
T-72 (812)         141       13      15         26       83.43
T-72 (s7)          122       13      30         26       73.94
T-72 (132)*        177        3       5         11       95.68
BTR-70 (c72)*        3      183       0         10       98.39
BMP-2 (9563)*        5        5     170         15       94.44
BMP-2 (9566)        31       16     121         28       72.02
BMP-2 (c21)         23        4     152         17       84.92

Pcc|d = 85.20%

Table 9. MLP misclassification rate (%) and confuser rejection rate (%) (Pd set to 0.9)

              T-72   BTR-70   BMP-2   Rejected
2S1          16.42    33.94   31.75      17.88
D7            8.03     0.73   86.86       4.38
T62          54.95     9.16   17.58      18.32
ZIL-131      25.55    27.01   28.83      18.61
BTR-60       24.62    28.72   22.56      24.10
ZSU-23/4     52.75     0.37   31.14      15.75

Next, the experiment is continued without any thresholding on the output nodes. This raises
the false alarm rate to 100%, because each image is forced to be classified as one of the
trained classes of vehicles. First, the output is computed for a test image, and then the error is
calculated for each output node. The image is then assigned to the class whose output node has
the minimum error. Testing set 1 is therefore used for this experiment; there is no need to
use testing set 2 except to determine which target type each confuser is most similar to. The
percentage of correct classification of declared targets is shown in Table 10.

Table 10. MLP confusion matrix - does not reject any vehicle (100% false alarm rate)

                  T-72   BTR-70   BMP-2   Pcc|d (%)
T-72 (812)         152       16      27       77.95
T-72 (s7)          135       15      41       70.68
T-72 (132)         185        4       7       94.39
BTR-70 (c72)         3      189       4       96.43
BMP-2 (9563)         6        6     183       93.85
BMP-2 (9566)        41       16     139       70.92
BMP-2 (c21)         27        7     162       82.65
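
The minimum-error decision rule can be sketched as follows, with an optional per-class threshold so the same function covers the earlier thresholded experiments. This is a sketch under stated assumptions, not the report's code: the per-node error is assumed here to be the squared distance of the node output from a desired value of 1, and an image is assumed to be rejected when the smallest error exceeds the threshold of its class. The report's actual error measure may differ, and the name `classify` is hypothetical.

```python
import numpy as np

CLASSES = ("T-72", "BTR-70", "BMP-2")

def classify(outputs, thresholds=None):
    """Assign a 3-node network output to a class, or return None if rejected.

    Assumption: the error of node k is (1 - outputs[k]) ** 2.
    With thresholds=None, every image is forced into one of the classes.
    """
    outputs = np.asarray(outputs, dtype=float)
    errors = (1.0 - outputs) ** 2
    k = int(np.argmin(errors))  # output node with the minimum error
    if thresholds is not None and errors[k] > thresholds[k]:
        return None             # error too large: reject the image
    return CLASSES[k]

# Made-up network outputs, for illustration only.
print(classify([0.2, 0.1, 0.7]))                      # forced decision
print(classify([0.4, 0.3, 0.5], (0.03, 0.01, 0.12)))  # thresholded decision
```

With forced decisions no image is ever rejected, which is why the false alarm rate on confusers becomes 100%.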

Once again, the advantage of using a neural network classifier is that it consumes less
memory than many other methods and makes decisions very quickly. The memory space
needed for this classifier is as little as 13 KB, and testing one image takes 16 ms on a
Pentium 4 CPU 2.0 GHz computer using Matlab.


6.  Conclusion 


Here, a four-layer MLP NN is implemented to classify three ground target vehicle types from
SAR imagery. The NN is trained using a back-propagation algorithm on imagery with a
17-degree depression angle. For testing, data with a 15-degree depression angle is used.
Standard evaluation methods for the MSTAR data, using Receiver Operating Characteristic
curves and confusion matrices, are used to evaluate the MLP neural network classifier. The
MLP NN classifier produces its result very quickly (16 ms per image on a Pentium 4 CPU
2.0 GHz computer using Matlab) and consumes a small amount of memory space (13 KB).
From the ROC curves, the MLP NN classifier performs much better than a random classifier [5].
It is possible to increase the classification performance and decrease false alarms by
revisiting the signature extraction, the MLP NN architecture, the choice of data for the training set, etc.